Transductive Inference for Text Classi cation using Support Vector Machines
نویسنده
چکیده
This paper introduces Transductive Support Vector Machines (TSVMs) for text classi cation. While regular Support Vector Machines (SVMs) try to induce a general decision function for a learning task, Transductive Support Vector Machines take into account a particular test set and try to minimize misclassi cations of just those particular examples. The paper presents an analysis of why TSVMs are well suited for text classi cation. These theoretical ndings are supported by experiments on three test collections. The experiments show substantial improvements over inductive methods, especially for small training sets, cutting the number of labeled training examples down to a twentieth on some tasks. This work also proposes an algorithm for training TSVMs e ciently, handling 10,000 examples and more.
منابع مشابه
Prediction of soil cation exchange capacity using support vector regression optimized by genetic algorithm and adaptive network-based fuzzy inference system
Soil cation exchange capacity (CEC) is a parameter that represents soil fertility. Being difficult to measure, pedotransfer functions (PTFs) can be routinely applied for prediction of CEC by soil physicochemical properties that can be easily measured. This study developed the support vector regression (SVR) combined with genetic algorithm (GA) together with the adaptive network-based fuzzy infe...
متن کاملSupport Vector Learning for Fuzzy Rule - Based Classi cation Systems
|To design a fuzzy rule-based classi cation system (fuzzy classi er) with good generalization ability in a high dimensional feature space has been an active research topic for a long time. As a powerful machine learning approach for pattern recognition problems, support vector machine (SVM) is known to have good generalization ability. More importantly, an SVM can work very well on a high (or e...
متن کاملTransductive Support Vector Machines
In contrast to learning a general prediction rule, V. Vapnik proposed the transductive learning setting where predictions are made only at a fixed number of known test points. This allows the learning algorithm to exploit the location of the test points, making it a particular type of semi-supervised learning problem. Transductive support vector machines (TSVMs) implement the idea of transducti...
متن کاملKeyword Spotting from Online Chinese Handwritten Documents using One-versus-All Character Classification Model
In this paper, we propose a method for text-query-based keyword spotting from online Chinese handwritten documents using character classi ̄cation model. The similarity between the query word and handwriting is obtained by combining the character classi ̄cation scores. The classi ̄er is trained by one-versus-all strategy so that it gives high similarity to the target class and low scores to the oth...
متن کاملClient Dependent GMM-SVM Models for Speaker Veri cation
Generative Gaussian Mixture Models (GMMs) are known to be the dominant approach for modeling speech sequences in text independent speaker veri cation applications because of their scalability, good performance and their ability in handling variable size sequences. On the other hand, because of their discriminative properties, models like Support Vector Machines (SVMs) usually yield better perfo...
متن کامل